Joint Parsing and Alignment with Weakly Synchronized Grammars

Authors

  • David Burkett
  • John Blitzer
  • Dan Klein
Abstract

Syntactic machine translation systems extract rules from bilingual, word-aligned, syntactically parsed text, but current systems for parsing and word alignment are at best cascaded and at worst totally independent of one another. This work presents a unified joint model for simultaneous parsing and word alignment. To flexibly model syntactic divergence, we develop a discriminative log-linear model over two parse trees and an ITG derivation which is encouraged but not forced to synchronize with the parses. Our model gives absolute improvements of 3.3 F1 for English parsing, 2.1 F1 for Chinese parsing, and 5.5 F1 for word alignment over each task’s independent baseline, giving the best reported results for both Chinese-English word alignment and joint parsing on the parallel portion of the Chinese treebank. We also show an improvement of 1.2 BLEU in downstream MT evaluation over basic HMM alignments.


Similar resources

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural-language sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...


Alignment Elimination from Adams' Grammars

Adams’ extension of parsing expression grammars enables specifying indentation sensitivity using two non-standard grammar constructs — indentation by a binary relation and alignment. This paper proposes a step-by-step transformation of well-formed Adams’ grammars for elimination of the alignment construct from the grammar. The idea that alignment could be avoided was suggested by Adams but no p...


Dealing with Spurious Ambiguity in Learning ITG-based Word Alignment

Word alignment has an exponentially large search space, which often makes exact inference infeasible. Recent studies have shown that inversion transduction grammars are reasonable constraints for word alignment, and that the constrained space could be efficiently searched using synchronous parsing algorithms. However, spurious ambiguity may occur in synchronous parsing and cause problems in bot...


S4 enriched multimodal categorial grammars are context-free

Bar-Hillel et al. [1] prove that applicative categorial grammars weakly recognize the context-free languages. Buszkowski [2] proves that grammars based on the product-free fragment of the non-associative Lambek calculus NL recognize exactly the context-free languages. Kandulski [7] furthers this result by proving that grammars based on NL also recognize exactly the context-free languages. Jäger ...


Deterministic Shift-Reduce Parsing for Unification-Based Grammars by Using Default Unification

Many parsing techniques, including parameter estimation, assume the use of a packed parse forest for efficient and accurate parsing. However, they have several inherent problems deriving from the restriction of locality in the packed parse forest. Deterministic parsing is one of the solutions that can achieve simple and fast parsing without the mechanisms of the packed parse forest by accurately choo...




Publication date: 2010